Corpus: cym_wikipedia_2007_30K


4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of word N-grams (N = 1 to 5) at sentence endings, counted over the first K sentences

K         # of words   # of bigrams   # of trigrams   # of 4-grams   # of 5-grams
100               42             42              42             42             42
1000             143            164             174            338            346
10000           3601           5108            5768           6158           6231
100000         11750          20968           26289          28430          29117
1000000        11750          20968           26289          28430          29117
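
The report does not say how these counts are produced. A minimal sketch of one plausible reading follows, assuming that each figure is the number of distinct N-grams formed by the last N words of a sentence, that sentences are tokenized on whitespace, and that sentences shorter than N contribute nothing for that N. The function name ending_ngram_counts and its parameters are hypothetical and are not taken from the corpus tooling.

```python
def ending_ngram_counts(sentences, k, max_n=5):
    """Count distinct sentence-final word n-grams in the first k sentences.

    Assumptions (not confirmed by the report): whitespace tokenization,
    and a sentence with fewer than n tokens is skipped for that n.
    Returns a dict mapping n -> number of distinct ending n-grams.
    """
    seen = {n: set() for n in range(1, max_n + 1)}
    for sentence in sentences[:k]:
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                # The last n tokens form the sentence-final n-gram.
                seen[n].add(tuple(tokens[-n:]))
    return {n: len(grams) for n, grams in seen.items()}
```

Calling the function with increasing values of k (100, 1000, 10000, ...) on the same sentence list would yield one table row per call. Once k exceeds the number of available sentences, the counts stop changing.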


[Figure: Zipf's diagram for sentence endings (Gnuplot diagram)]
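
A Zipf diagram plots frequency against rank, usually on log-log axes. The following is a rough sketch of how the (rank, frequency) points for sentence endings could be derived, assuming whitespace tokenization and that the diagram is based on how often each sentence-final N-gram occurs. The function name zipf_points is hypothetical, and the report does not state which N the original diagram uses.

```python
from collections import Counter


def zipf_points(sentences, n=1):
    """Return (rank, frequency) pairs for sentence-final n-grams.

    Assumption: tokens are separated by whitespace; sentences with
    fewer than n tokens are ignored. Rank 1 is the most frequent ending.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if len(tokens) >= n:
            counts[tuple(tokens[-n:])] += 1
    # Sort frequencies in descending order and number them from rank 1.
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))
```

The resulting pairs can be written out as two columns and plotted with logarithmic scaling on both axes.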

Generated in 3355 msec on 2017-12-05 at 13:15